ARRAYED BIOMOLECULES AND THEIR USE IN SEQUENCING

Cross-Reference to Related Application

This Application is a continuation-in-part of PCT/GB99/02487, filed July 30,
1999.

Field of The Invention 

This invention relates to fabricated arrays of molecules, and to their analytical 
applications. In particular, this invention relates to the use of fabricated arrays in 
methods for obtaining genetic sequence information.

Background of the Invention

Advances in the study of molecules have been led, in part, by improvement in
technologies used to characterise the molecules or their biological reactions. In 
particular, the study of nucleic acids, DNA and RNA, has benefited from developing 
technologies used for sequence analysis and the study of hybridisation events. 

An example of the technologies that have improved the study of nucleic acids is
the development of fabricated arrays of immobilised nucleic acids. These arrays typically
consist of a high-density matrix of polynucleotides immobilised onto a solid support
material. Fodor et al., Trends in Biotechnology (1994) 12:19-26, describes ways of
assembling the nucleic acid arrays using a chemically sensitised glass surface protected
by a mask, but exposed at defined areas to allow attachment of suitably modified
nucleotides. Typically, these arrays may be described as "many molecule" arrays, as
distinct regions are formed on the solid support comprising a high density of one specific
type of polynucleotide.

An alternative approach is described by Schena et al., Science (1995) 270:467-470,
where samples of DNA are positioned at predetermined sites on a glass microscope
slide by robotic micropipetting techniques. The DNA is attached to the glass surface
along its entire length by non-covalent electrostatic interactions. However, although
hybridisation with complementary DNA sequences can occur, this approach may not
permit the DNA to be freely available for interacting with other components such as
polymerase enzymes, DNA-binding proteins etc.

Recently, the Human Genome Project determined the entire sequence of the
human genome: all 3 x 10^9 bases. The sequence information represents that of an average
human. However, there is still considerable interest in identifying differences in the
genetic sequence between different individuals. The most common form of genetic
variation is single nucleotide polymorphisms (SNPs). On average one base in 1000 is a
SNP, which means that there are 3 million SNPs for any individual. Some of the SNPs
are in coding regions and produce proteins with different binding affinities or properties.
Some are in regulatory regions and result in a different response to changes in levels of
metabolites or messengers. SNPs are also found in non-coding regions, and these are
also important as they may correlate with SNPs in coding or regulatory regions. The key
problem is to develop a low cost way of determining one or more of the SNPs for an
individual.

The nucleic acid arrays may be used to determine SNPs, and they have been used
to study hybridisation events (Mirzabekov, Trends in Biotechnology (1994) 12:27-32).
Many of these hybridisation events are detected using fluorescent labels attached to
nucleotides, the labels being detected using a sensitive fluorescent detector, e.g. a
charge-coupled detector (CCD). The major disadvantages of these methods are that it is not
possible to sequence long stretches of DNA, and that repeat sequences can lead to
ambiguity in the results. These problems are recognised in Automation Technologies for
Genome Characterisation, Wiley-Interscience (1997), ed. T. J. Beugelsdijk, Chapter 10:
205-225.

In addition, the use of high-density arrays in a multi-step analysis procedure can
lead to problems with phasing. Phasing problems result from a loss in the
synchronisation of a reaction step occurring on different molecules of the array. If some
of the arrayed molecules fail to undergo a step in the procedure, subsequent results
obtained for these molecules will no longer be in step with results obtained for the other
arrayed molecules. The proportion of molecules out of phase will increase through
successive steps and consequently the results detected will become ambiguous. This
problem is recognised in the sequencing procedure described in US-A-5302509.
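
The growth of the out-of-phase population may be illustrated by a short calculation. The following sketch is an idealisation and is not taken from the cited documents: it assumes that each molecule in a multi-molecule region completes every step independently and with the same probability (`step_efficiency`, a hypothetical parameter), so that the fraction still in phase after n steps is that probability raised to the power n.

```python
def in_phase_fraction(step_efficiency: float, steps: int) -> float:
    """Fraction of molecules that have completed every one of `steps` steps.

    Assumes (for illustration only) that each molecule completes each step
    independently with probability `step_efficiency`.
    """
    return step_efficiency ** steps


if __name__ == "__main__":
    # 0.95 is an assumed per-step efficiency, chosen only as an example.
    for n in (1, 10, 20, 50):
        print(f"after {n:2d} steps: {in_phase_fraction(0.95, n):.3f} in phase")
```

Under the assumed 95% efficiency, fewer than 8% of the molecules in a region would remain in phase after 50 steps, which is one way of seeing why the detected signal becomes ambiguous.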

An alternative sequencing approach is disclosed in EP-A-0381693, which
comprises hybridising a fluorescently-labelled strand of DNA to a target DNA sample
suspended in a flowing sample stream, and then using an exonuclease to cleave
repeatedly the end base from the hybridised DNA. The cleaved bases are detected in
sequential passage through a detector, allowing reconstruction of the base sequence of
the DNA. Each of the different nucleotides has a distinct fluorescent label attached,
which is detected by laser-induced fluorescence. This is a complex method, primarily
because it is difficult to ensure that every nucleotide of the DNA strand is labelled and
that this has been achieved with high fidelity to the original sequence.

WO-A-96/27025 is a general disclosure of single molecule arrays. Although
sequencing procedures are disclosed, there is little description of the applications to
which the arrays can be applied. There is also only a general discussion on how to
prepare the arrays.

Summary of the Invention

According to the present invention, a device comprises a high density array of
molecules capable of interrogation and immobilised on a solid generally planar surface,
wherein the array allows the molecules to be individually resolved by optical microscopy,
and wherein each molecule is immobilised by covalent bonding to the surface, other than
at that part of each molecule that can be interrogated.

According to a second aspect of the invention, a device comprises a high density
array of relatively short molecules and relatively long polynucleotides immobilised on the
surface of a solid support, wherein the polynucleotides are at a density that permits
individual resolution of those parts that extend beyond the relatively short molecules. In
this aspect, the shorter molecules can prevent non-specific binding of reagents to the
solid support, and therefore reduce background interference.

According to a third aspect of the invention, a device comprises an array of
polynucleotide molecules immobilised on a solid surface, wherein each molecule
comprises a polynucleotide duplex linked via a covalent bond to form a hairpin loop
structure, one end of which comprises a target polynucleotide, and the array has a surface
density which allows the target polynucleotides to be individually resolved. In this
aspect, the hairpin structures act to tether the target to a primer polynucleotide. This
prevents loss of the primer-target during the washing steps of a sequencing procedure.
The hairpins may therefore improve the efficiency of the sequencing procedures.

The arrays of the present invention comprise what are effectively single
molecules. This has many important benefits for the study of the molecules and their
interaction with other biological molecules. In particular, fluorescence events occurring
on each molecule can be detected using an optical microscope linked to a sensitive
detector, resulting in a distinct signal for each molecule.

When used in a multi-step analysis of a population of single molecules, the
phasing problems that are encountered using high density (multi-molecule) arrays of the
prior art can be reduced or removed. Therefore, the arrays also permit a massively
parallel approach to monitoring fluorescent or other events on the molecules. Such
massively parallel data acquisition makes the arrays extremely useful in a wide range of
analysis procedures which involve the screening/characterising of heterogeneous mixtures
of molecules. The arrays can be used to characterise a particular synthetic chemical or
biological moiety, for example in screening for particular molecules produced in
combinatorial synthesis reactions.

The arrays of the present invention are particularly suitable for use with
polynucleotides as the molecular species. The preparation of the arrays requires only
small amounts of polynucleotide sample and other reagents, and can be carried out by
simple means. Polynucleotide arrays according to the invention permit massively parallel
sequencing chemistries to be performed. For example, the arrays permit simultaneous
chemical reactions on and analysis of many individual polynucleotide molecules. The
arrays are therefore very suitable for determining polynucleotide sequences.

An array of the invention may also be used to generate a spatially addressable
array of single polynucleotide molecules. This is the simple consequence of sequencing
the array. Particular advantages of such a spatially addressable array include the
following:

1) Polynucleotide molecules on the array may act as identifier tags and may only
need to be 10-20 bases long, and the efficiency required in the sequencing steps may only
need to be better than 50%, as there will be no phasing problems.

2) The arrays may be reusable for screening once created and sequenced. All
possible sequences can be produced in a very simple way, e.g. compared to a high density
multi-molecule DNA chip made using photolithography.
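
The tag lengths mentioned in advantage 1) may be put into context by counting the distinct sequences available at a given length. The sketch below rests only on the fact that each position can hold one of four bases; the function names are illustrative and the array size of 10^7 molecules is used purely as an example.

```python
def distinct_tags(length: int) -> int:
    """Number of distinct sequences of `length` bases (four choices per base)."""
    return 4 ** length


def min_tag_length(molecules: int) -> int:
    """Shortest tag length giving at least `molecules` distinct sequences."""
    length = 0
    while distinct_tags(length) < molecules:
        length += 1
    return length


if __name__ == "__main__":
    print(distinct_tags(10))        # 1048576
    print(distinct_tags(20))        # 1099511627776
    print(min_tag_length(10 ** 7))  # 12
```

On this count a 10-base tag distinguishes about a million molecules, and a 12-base tag is the shortest that could, in principle, give every molecule of a 10^7-molecule array a different sequence.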

Description of the Drawings

Figure 1 is a schematic representation of apparatus that may be used to image
arrays of the present invention;

Figure 2 illustrates the immobilisation of a polynucleotide to a solid surface via
a microsphere;

Figure 3 shows a fluorescence time profile from a single fluorophore-labelled
oligonucleotide, with excitation at 514 nm and detection at 600 nm;

Figure 4 shows fluorescently labelled single molecule DNA covalently attached
to a solid surface; and

Figure 5 shows images of surface bound oligonucleotides hybridised with the
complementary sequence.

Description of the Invention

According to the present invention, the single molecules immobilised onto the
surface of a solid support should be capable of being resolved by optical means. This
means that, within the resolvable area of the particular imaging device used, there must
be one or more distinct images each representing one molecule. Typically, the molecules
of the array are resolved using a single molecule fluorescence microscope equipped with
a sensitive detector, e.g. a charge-coupled detector (CCD). Each molecule of the array
may be analysed simultaneously or, by scanning the array, a fast sequential analysis can
be performed.

The molecules of the array are typically DNA, RNA or nucleic acid mimics, e.g.
PNA or 2'-O-Meth-RNA. However, any other biomolecules, including peptides,
polypeptides and other organic molecules, may be used. The molecules are formed on
the array to allow interaction with other "cognate" molecules. It is therefore important
to immobilise the molecules so that the portion of the molecule not physically attached
to the solid support is capable of being interrogated by a cognate. In some applications all
the molecules in the single array will be the same, and may be used to interrogate
molecules that are largely distinct. In other applications, the molecules on the array may
all, or substantially all, be different, e.g. less than 50%, preferably less than 30% of the
molecules will be the same.

The term "single molecule" is used herein to distinguish from high density
multi-molecule arrays in the prior art, which may comprise distinct clusters of many
molecules of the same type.

The term "individually resolved" is used herein to indicate that, when visualised,
it is possible to distinguish one molecule on the array from its neighbouring molecules.
Visualisation may be effected by the use of reporter labels, e.g. fluorophores, the signal
of which is individually resolved.



The term "cognate molecule" is used herein to refer to any molecule capable of
interacting with, or interrogating, the arrayed molecule. The cognate may be a molecule that
binds specifically to the arrayed molecule, for example a complementary polynucleotide,
in a hybridisation reaction.

The term "interrogate" is used herein to refer to any interaction of the arrayed
molecule with any other molecule. The interaction may be covalent or non-covalent.

The terms "arrayed polynucleotides" and "polynucleotide arrays" are used herein
to define a plurality of single molecules that are characterised by comprising a
polynucleotide. The term is intended to include the attachment of other molecules to a
solid surface, the molecules having a polynucleotide attached that can be further
interrogated. For example, the arrays may comprise protein molecules immobilised on
a solid surface, the protein molecules being conjugated or otherwise bound to a short
polynucleotide molecule that may be interrogated, to address the array.

The density of the arrays is not critical. However, the present invention can make
use of a high density of single molecules, and these are preferable. For example, arrays
with a density of 10^6-10^9 molecules per cm^2 may be used. Preferably, the density is at
least 10^7/cm^2 and typically up to 10^8/cm^2. These high density arrays are in contrast to
other arrays which may be described in the art as "high density" but which are not
necessarily as high and/or which do not allow single molecule resolution.

Using the methods and apparatus of the present invention, it may be possible to
image at least 10^7 or 10^8 molecules simultaneously. Fast sequential imaging may be
achieved using a scanning apparatus; shifting and transfer between images may allow
higher numbers of molecules to be imaged.

The extent of separation between the individual molecules on the array will be
determined, in part, by the particular technique used to resolve the individual molecule.
Apparatus used to image molecular arrays are known to those skilled in the art. For
example, a confocal scanning microscope may be used to scan the surface of the array
with a laser to image directly a fluorophore incorporated on the individual molecule by
fluorescence. This may be achieved using the apparatus illustrated in Fig. 1. Fig. 1 shows
a detector 1, a bandpass filter 2, a pinhole 3, a mirror 4, a laser beam 5, a dichroic mirror
6, an objective 7, a glass coverslip 8 and a sample 9 under study. Alternatively, a
sensitive 2-D detector, such as a charge-coupled detector, can be used to provide a 2-D
image representing the individual molecules on the array.

Resolving single molecules on the array with a 2-D detector can be done if, at 100x
magnification, adjacent molecules are separated by a distance of at least approximately
250 nm, preferably at least 300 nm and more preferably at least 350 nm. It will be
appreciated that these distances are dependent on magnification, and that other values
can be determined accordingly, by one of ordinary skill in the art.
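
The relationship between the minimum separation and the highest usable density may be estimated as follows. The sketch assumes, purely for illustration, that the molecules sit on a regular square grid with the stated separation between nearest neighbours; the text does not prescribe any such arrangement, and the function name is hypothetical.

```python
def max_grid_density_per_cm2(separation_nm: float) -> float:
    """Molecules per cm^2 on a square grid with the given nearest-neighbour spacing."""
    separation_cm = separation_nm * 1e-7  # 1 nm = 1e-7 cm
    return 1.0 / separation_cm ** 2


if __name__ == "__main__":
    for separation in (250, 300, 350):
        density = max_grid_density_per_cm2(separation)
        print(f"{separation} nm spacing: {density:.2e} molecules per cm^2")
```

On this idealised grid a 250 nm spacing corresponds to about 1.6 x 10^9 molecules per cm^2, so densities of the order described above remain compatible with the separations given for a 2-D detector.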

Other techniques such as scanning near-field optical microscopy (SNOM) are
available which are capable of greater optical resolution, thereby permitting more dense
arrays to be used. For example, using SNOM, adjacent molecules may be separated by
a distance of less than 100 nm, e.g. 10 nm. For a description of scanning near-field optical
microscopy, see Mayer et al., Laser Focus World (1993) 29(10).

An additional technique that may be used is surface-specific total internal
reflection fluorescence microscopy (TIRFM); see, for example, Vale et al., Nature
(1996) 380:451-453. Using this technique, it is possible to achieve wide-field imaging
(up to 100 µm x 100 µm) with single molecule sensitivity. This may allow arrays of
greater than 10^7 resolvable molecules per cm^2 to be used.

Additionally, the techniques of scanning tunnelling microscopy (Binnig et al.,
Helvetica Physica Acta (1982) 55:726-735) and atomic force microscopy (Hansma et al.,
Ann. Rev. Biophys. Biomol. Struct. (1994) 23:115-139) are suitable for imaging the
arrays of the present invention. Other devices which do not rely on microscopy may also
be used, provided that they are capable of imaging within discrete areas on a solid
support.

Single molecules may be arrayed by immobilisation to the surface of a solid
support. This may be carried out by any known technique, provided that suitable
conditions are used to ensure adequate separation of the molecules. Generally the array
is produced by dispensing small volumes of a sample containing a mixture of molecules
onto a suitably prepared solid surface, or by applying a dilute solution to the solid surface
to generate a random array. In this manner, a mixture of different molecules may be
arrayed by simple means. The formation of the single molecule array then permits
interrogation of each arrayed molecule to be carried out.
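
For a random array, the proportion of molecules that remain adequately separated may be estimated as sketched below. The estimate assumes that the molecules land independently and uniformly on the surface (a Poisson point pattern), which is an idealisation not stated in the text; under that assumption the probability that a molecule has no neighbour within a distance r is exp(-density x pi x r^2). The function name is illustrative.

```python
import math


def isolated_fraction(density_per_cm2: float, min_separation_nm: float) -> float:
    """Fraction of molecules with no neighbour closer than `min_separation_nm`.

    Assumes a uniform random (Poisson) distribution of molecules on the surface.
    """
    radius_cm = min_separation_nm * 1e-7  # 1 nm = 1e-7 cm
    expected_neighbours = density_per_cm2 * math.pi * radius_cm ** 2
    return math.exp(-expected_neighbours)


if __name__ == "__main__":
    for density in (1e7, 1e8, 1e9):
        fraction = isolated_fraction(density, 250)
        print(f"{density:.0e} per cm^2: {fraction:.2f} resolvable at 250 nm")
```

Under these assumptions nearly all molecules are resolvable at 10^7 per cm^2, whereas at 10^9 per cm^2 only a minority would be, which illustrates why the dilution of the applied solution matters when a random array is prepared.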

Suitable solid supports are available commercially, and will be apparent to the
skilled person. The supports may be manufactured from materials such as glass,
ceramics, silica and silicon. The supports usually comprise a flat (planar) surface, or at
least an array in which the molecules to be interrogated are in the same plane. Any
suitable size may be used. For example, the supports might be of the order of 1-10 cm
in each direction.

It is important to prepare the solid support under conditions which minimise or
avoid the presence of contaminants. The solid support must be cleaned thoroughly,
preferably with a suitable detergent, e.g. Decon-90, to remove dust and other
contaminants.

Immobilisation may be by specific covalent or non-covalent interactions.
Covalent attachment is preferred. If the molecule is a polynucleotide, immobilisation will
preferably be at either the 5' or 3' position, so that the polynucleotide is attached to the
solid support at one end only. However, the polynucleotide may be attached to the solid
support at any position along its length, the attachment acting to tether the
polynucleotide to the solid support. The immobilised polynucleotide is then able to
undergo interactions with other molecules or cognates at positions distant from the solid
support. Typically the interaction will be such that it is possible to remove any molecules
bound to the solid support through non-specific interactions, e.g. by washing.
Immobilisation in this manner results in well separated single molecules. The advantage
of this is that it prevents interaction between neighbouring molecules on the array, which
may hinder interrogation of the array.

In one embodiment of the invention, the surface of a solid support is first coated
with streptavidin or avidin, and then a dilute solution of a biotinylated molecule is added
at discrete sites on the surface using, for example, a nanolitre dispenser to deliver one
molecule on average to each site.
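
When one molecule on average is delivered to each site, the actual number at any given site will vary. A minimal sketch of that variation is given below, assuming that the number of molecules per dispensed droplet follows a Poisson distribution; this is a common model for dilute solutions but is an assumption here, not a statement from the text.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly `k` molecules at a site when `mean` are expected."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    mean_per_site = 1.0  # "one molecule on average to each site"
    empty = poisson_pmf(0, mean_per_site)
    single = poisson_pmf(1, mean_per_site)
    multiple = 1.0 - empty - single
    print(f"empty sites:    {empty:.3f}")
    print(f"single sites:   {single:.3f}")
    print(f"multiple sites: {multiple:.3f}")
```

Under this model roughly 37% of sites would hold exactly one molecule, a similar share would be empty, and the remainder would hold two or more; a more dilute solution lowers the share of multiply occupied sites at the cost of more empty ones.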

In a preferred embodiment of the invention, the solid surface is coated with an
epoxide and the molecules are coupled via an amine linkage. It is also preferable to avoid
or reduce salt present in the solution containing the molecule to be arrayed. Reducing
the salt concentration minimises the possibility of the molecules aggregating in the
solution, which may affect the positioning on the array.



If the molecule is a polynucleotide, then immobilisation may be via hybridisation
to a complementary nucleic acid molecule previously attached to a solid support. For
example, the surface of a solid support may be first coated with a primer polynucleotide
at discrete sites on the surface. Single-stranded polynucleotides are then brought into
contact with the arrayed primers under hybridising conditions and allowed to "self-sort"
onto the array. In this way, the arrays may be used to separate the desired
polynucleotides from a heterogeneous sample of polynucleotides.

Alternatively, the arrayed primers may be composed of double-stranded
polynucleotides with a single-stranded overhang ("sticky-ends"). Hybridisation with
target polynucleotides is then allowed to occur and a DNA ligase used to covalently link
the target DNA to the primer. The second DNA strand can then be removed under
melting conditions to leave an arrayed polynucleotide.

In an embodiment of the invention, the target molecules are immobilised onto
non-fluorescent streptavidin or avidin-functionalised polystyrene latex microspheres, as
shown in Fig. 2. Fig. 2 shows a microsphere 11, a streptavidin molecule 12, a biotin
molecule 13 and a fluorescently labelled polynucleotide 14. The microspheres are
immobilised in turn onto a solid support to fix the target sample for microscope analysis.
Alternative microspheres suitable for use in the present invention are well known in the
art.

In one aspect of the present invention, the devices comprise arrayed
polynucleotides, each polynucleotide comprising a hairpin loop structure, one end of
which comprises a target polynucleotide, the other end comprising a relatively short
polynucleotide capable of acting as a primer in the polymerase reaction. This ensures
that the primer is able to perform its priming function during a polymerase-based
sequencing procedure, and is not removed during any washing step in the procedure.
The target polynucleotide is capable of being interrogated.

The term "hairpin loop structure" refers to a molecular stem and loop structure
formed from the hybridisation of complementary polynucleotides that are covalently
linked. The stem comprises the hybridised polynucleotides and the loop is the region that
covalently links the two complementary polynucleotides. Anything from a 10 to 20 (or
more) base pair double-stranded (duplex) region may be used to form the stem. In one
embodiment, the structure may be formed from a single-stranded polynucleotide having
complementary regions. The loop in this embodiment may be anything from 2 or more
non-hybridised nucleotides. In a second embodiment, the structure is formed from two
separate polynucleotides with complementary regions, the two polynucleotides being
linked (and the loop being at least partially formed) by a linker moiety. The linker moiety
forms a covalent attachment between the ends of the two polynucleotides. Linker
moieties suitable for use in this embodiment will be apparent to the skilled person. For
example, the linker moiety may be polyethylene glycol (PEG).

There are many different ways of forming the hairpin structure to incorporate the
target polynucleotide. However, a preferred method is to form a first molecule capable
of forming a hairpin structure, and ligate the target polynucleotide to this. Ligation may
be carried out either prior to or after immobilisation to the solid support. The resulting
structure comprises the single-stranded target polynucleotide at one end of the hairpin
and a primer polynucleotide at the other end.

In one embodiment, the target polynucleotide is genomic DNA purified using
conventional methods. The genomic DNA may be PCR-amplified or used directly to
generate fragments of DNA using either restriction endonucleases, other suitable
enzymes, a mechanical form of fragmentation or a non-enzymatic chemical fragmentation
method. In the case of fragments generated by restriction endonucleases, hairpin
structures bearing a complementary restriction site at the end of the first hairpin may be
used, and selective ligation of one strand of the DNA sample fragments may be achieved
by one of two methods.

Method 1 uses a first hairpin whose restriction site contains a phosphorylated 5'
end. Using this method, it may be necessary to first de-phosphorylate the
restriction-cleaved genomic or other DNA fragments prior to ligation such that only one
sample strand is covalently ligated to the hairpin.

Method 2: in the design of the hairpin, a single (or more) base gap can be
incorporated at the 3' end (the recessed strand) such that upon ligation of the DNA
fragments only one strand is covalently joined to the hairpin. The base gap can be formed
by hybridising a further separate polynucleotide to the 5'-end of the first hairpin structure.
On ligation, the DNA fragment has one strand joined to the 5'-end of the first hairpin, and
the other strand joined to the 3'-end of the further polynucleotide. The further
polynucleotide (and the other strand of the DNA fragment) may then be removed by
disrupting hybridisation.

In either case, the net result should be covalent ligation of only one strand of a
DNA fragment of genomic or other DNA, to the hairpin. Such ligation reactions may
be carried out in solution at optimised concentrations based on conventional ligation
chemistry, for example, carried out by DNA ligases or non-enzymatic chemical ligation.
Should the fragmented DNA be generated by random shearing of genomic DNA or
polymerase, then the ends can be filled in with Klenow fragment to generate blunt-ended
fragments which may be blunt-end-ligated onto blunt-ended hairpins. Alternatively, the
blunt-ended DNA fragments may be ligated to oligonucleotide adapters which are
designed to allow compatible ligation with the sticky-end hairpins, in the manner
described previously.

The hairpin-ligated DNA constructs may then be covalently attached to the
surface of a solid support to generate a single molecule array (SMA), or ligation may
follow attachment to form the array.

The arrays may then be used in procedures to determine the sequence of the
target polynucleotide. If the target fragments are generated via restriction digest of
genomic DNA, the recognition sequence of the restriction or other nuclease enzyme will
provide 4, 6, 8 bases or more of known sequence (dependent on the enzyme). Further
sequencing of between 10 and 20 bases on the SMA should provide sufficient overall
sequence information to place that stretch of DNA into unique context with a total
human genome sequence, thus enabling the sequence information to be used for
genotyping and more specifically single nucleotide polymorphism (SNP) scoring.
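
The number of bases needed to place a stretch of DNA uniquely may be estimated by a simple counting argument. The sketch below assumes an idealised genome in which all sequences of a given length are equally likely, which real genomes (with their repeats) do not satisfy; it therefore gives a lower bound rather than a guarantee. The function name is illustrative.

```python
GENOME_BASES = 3 * 10 ** 9  # genome size quoted earlier in the text


def min_unique_length(genome_bases: int) -> int:
    """Shortest length n for which 4**n is at least `genome_bases`."""
    n = 0
    while 4 ** n < genome_bases:
        n += 1
    return n


if __name__ == "__main__":
    needed = min_unique_length(GENOME_BASES)
    print(needed)  # 16
    # A 6-base recognition sequence plus 10 to 20 sequenced bases gives 16 to 26.
    print(6 + 10 >= needed, 6 + 20 >= needed)
```

On this estimate at least 16 bases are required, so a 6-base recognition sequence together with 10 to 20 further sequenced bases meets or exceeds the idealised minimum.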

Simple calculations have suggested the following, based on sequencing a 10^7
molecule SMA prepared from hairpin ligation: for a 6 base pair recognition sequence, a
single restriction enzyme will generate approximately 10^6 ends of DNA. If a stretch of
13 bases is sequenced on the SMA (i.e. 13 x 10^6 bases), approximately 13,000 SNPs will
be detected. One application of such a sample preparation and sequencing format would
in general be for SNP discovery in pharmaco-genetic analysis. The approach is therefore
suitable for forensic analysis or any other system which requires unambiguous
identification of individuals to a level as low as 10^3 SNPs.
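
The figures in the preceding paragraph can be reproduced approximately as follows. The sketch assumes a random-sequence genome of 3 x 10^9 bases, that each cut yields two fragment ends, and the rate of one SNP per 1000 bases quoted in the Background; the function names are illustrative and the two-ends-per-cut assumption is not stated in the text.

```python
GENOME_BASES = 3 * 10 ** 9  # genome size quoted in the Background
SNP_SPACING = 1000          # "one base in 1000 is a SNP"


def expected_restriction_ends(genome_bases: int, site_length: int) -> float:
    """Expected fragment ends for a recognition site of `site_length` bases.

    Assumes random sequence (one site per 4**site_length bases) and two
    ends per cut.
    """
    expected_sites = genome_bases / 4 ** site_length
    return 2 * expected_sites


def expected_snps(ends: int, bases_read_per_end: int) -> int:
    """Expected SNPs covered when `bases_read_per_end` bases are read at each end."""
    return ends * bases_read_per_end // SNP_SPACING


if __name__ == "__main__":
    # About 1.5 million, i.e. of the order of the 10^6 ends stated in the text.
    print(round(expected_restriction_ends(GENOME_BASES, 6)))
    print(expected_snps(10 ** 6, 13))  # 13000
```

With the rounded figure of 10^6 ends used in the text, reading 13 bases per end covers 13 x 10^6 bases and therefore about 13,000 SNPs.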

It is of course possible to sequence the complete target polynucleotide, if
required.

In a separate aspect of the invention, the devices may comprise immobilised
polynucleotides and other immobilised molecules. The other molecules are relatively
short compared to the polynucleotides and are intended to prevent non-specific
attachment of reagents, e.g. fluorophores, with the solid support, thereby reducing
background interference. In one embodiment, the other molecules are relatively short
polynucleotides. However, many different molecules may be used, e.g. peptides,
proteins, polymers and synthetic chemicals, as will be apparent to the skilled person.
Preparation of the devices may be carried out by first preparing a mixture of the relatively
long polynucleotides and of the relatively short molecules. Usually, the concentration of
the latter will be in excess of that of the long polynucleotides. The mixture is then placed
in contact with a suitably prepared solid support, to allow immobilisation to occur.

The single molecule arrays have many applications in methods which rely on the
detection of biological or chemical interactions with arrayed molecules. For example, the
arrays may be used to determine the properties or identities of cognate molecules.
Typically, interactions of biological or chemical molecules with the arrays are carried out
in solution.

In particular, the arrays may be used in conventional assays which rely on the
detection of fluorescent labels to obtain information on the arrayed molecules. The
arrays are particularly suitable for use in multi-step assays where the loss of
synchronisation in the steps was previously regarded as a limitation to the use of arrays.
When the arrays are composed of polynucleotides they may be used in conventional
techniques for obtaining genetic sequence information. Many of these techniques rely
on the stepwise identification of suitably labelled nucleotides, referred to in
US-A-5634413 as "single base" sequencing methods.

In an embodiment of the invention, the sequence of a target polynucleotide is
determined in a similar manner to that described in US-A-5634413, by detecting the
incorporation of nucleotides into the nascent strand through the detection of a fluorescent
label attached to the incorporated nucleotide. The target polynucleotide is primed with
a suitable primer (or prepared as a hairpin construct which will contain the primer as part
of the hairpin), and the nascent chain is extended in a stepwise manner by the polymerase
reaction. Each of the different nucleotides (A, T, G and C) incorporates a unique
fluorophore at the 3' position which acts as a blocking group to prevent uncontrolled
polymerisation. The polymerase enzyme incorporates a nucleotide into the nascent chain
complementary to the target, and the blocking group prevents further incorporation of
nucleotides. The array surface is then cleared of unincorporated nucleotides and each
incorporated nucleotide is "read" optically by a charge-coupled detector using laser
excitation and filters. The 3'-blocking group is then removed (deprotected), to expose
the nascent chain for further nucleotide incorporation.

Because the array consists of distinct optically resolvable polynucleotides, each
target polynucleotide will generate a series of distinct signals as the fluorescent events
are detected. Details of the full sequence are then determined.

The number of cycles that can be achieved is governed principally by the yield of
the deprotection cycle. If deprotection fails in one cycle, it is possible that later
deprotection and continued incorporation of nucleotides can be detected during the next
cycle. Because the sequencing is performed at the single molecule level, the sequencing
can be carried out on different polynucleotide sequences at one time without the
necessity for separation of the different sample fragments prior to sequencing. This
sequencing also avoids the phasing problems associated with prior art methods.
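
The stepwise cycle of incorporation, reading and deprotection, and the effect of an incomplete deprotection yield on a single molecule, may be sketched as a toy simulation. The model below is an illustration only and makes assumptions that are not in the text: the template is written in the order in which the polymerase traverses it, deprotection succeeds with a fixed probability per cycle (`deprotection_yield`, a hypothetical parameter), and a cycle in which no new nucleotide is incorporated can be recognised for each molecule individually.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_single_molecule(template, cycles, deprotection_yield, rng):
    """Simulate the per-cycle signals from one arrayed molecule.

    Returns a list with one entry per cycle: the base incorporated in that
    cycle, or None if the nascent chain was still blocked.
    """
    signals = []
    position = 0
    blocked = False
    for _ in range(cycles):
        if not blocked and position < len(template):
            # The polymerase adds one 3'-blocked, labelled nucleotide,
            # complementary to the template; it is then read optically.
            signals.append(COMPLEMENT[template[position]])
            position += 1
            blocked = True
        else:
            signals.append(None)
        # Deprotection step: may fail, leaving the chain blocked.
        if blocked and rng.random() < deprotection_yield:
            blocked = False
    return signals


def called_sequence(signals):
    """Join the bases read, skipping cycles without a new incorporation."""
    return "".join(base for base in signals if base is not None)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so that the run is repeatable
    signals = read_single_molecule("ATGCCGTA", 12, 0.8, rng)
    print(signals)
    print(called_sequence(signals))
```

In this model a failed deprotection only delays the molecule concerned: the bases that are read remain in the correct order, and other molecules on the array are unaffected, which is the sense in which single molecule sequencing avoids phasing.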

Deprotection may be carried out by chemical, photochemical or enzymatic
reactions.

A similar, and equally applicable, sequencing method is disclosed in
EP-A-0640146.

Other suitable sequencing procedures will be apparent to the skilled person. In
particular, the sequencing method may rely on the degradation of the arrayed
polynucleotides, the degradation products being characterised to determine the sequence.

An example of a suitable degradation technique is disclosed in WO-A-95/20053,
whereby bases on a polynucleotide are removed sequentially, a predetermined number
at a time, through the use of labelled adaptors specific for the bases, and a defined
exonuclease cleavage.

A consequence of sequencing using non-destructive methods is that it is possible
to form a spatially addressable array for further characterisation studies, and therefore
non-destructive sequencing may be preferred. In this context, the term "spatially
addressable" is used herein to describe how different molecules may be identified on the
basis of their position on an array.

Once sequenced, the spatially addressed arrays may be used in a variety of
procedures which require the characterisation of individual molecules from
heterogeneous populations.

One application is to use the arrays to characterise products synthesised in
combinatorial chemistry reactions. During combinatorial synthesis reactions, it is usual
for a tag or label to be incorporated onto a beaded support or reaction product for the
subsequent characterisation of the product. This is adapted in the present invention by
using polynucleotide molecules as the tags, each polynucleotide being specific for a
particular product, and using the tags to hybridise onto a spatially addressed array.
Because the sequence of each arrayed polynucleotide has been determined previously,
the detection of a hybridisation event on the array reveals the sequence of the
complementary tag on the product. Having identified the tag, it is then possible to
confirm which product this relates to. The complete process is therefore quick and
simple, and the arrays may be reused for high-throughput screening. Detection may be
carried out by attaching a suitable label to the product, e.g. a fluorophore.

Combinatorial chemistry reactions may be used to synthesise a diverse range of
different molecules, each of which may be identified using the addressed arrays of the
present invention. For example, combinatorial chemistry may be used to produce
therapeutic proteins or peptides that can be bound to the arrays to produce an addressed
array of target proteins. The targets may then be screened for activity, and those proteins
exhibiting activity may be identified by their position on the array as outlined above.
Similar principles apply to other products of combinatorial chemistry, for example
the synthesis of non-polymeric molecules of m.wt. <1000. Methods for generating
peptides/proteins by combinatorial methods are disclosed in US-A-5643768 and
US-A-5658754. Split-and-mix approaches may also be used, as described in Nielsen
et al, J. Am. Chem. Soc. (1993) 115, 9812-9813.

In an alternative approach, the products of the combinatorial chemistry reactions
may comprise a second polynucleotide tag not involved in the hybridisation to the array.
After formation by hybridisation, the array may be subjected to repeated polynucleotide
sequencing to identify the second tag which remains free. The sequencing may be carried
out as described previously.

Therefore, in this application, it is the tag that provides the spatial address on the
array. The tag may then be removed from the product by, for example, a cleavable
linker, to leave an untagged spatially addressed array.

A further application is to display proteins via an immobilised polysome
containing trapped polynucleotides and protein in a complex, as described in
US-A-5643768 and US-A-5658754.

In a separate embodiment of the invention, the arrays may be used to characterise
an organism. For example, an organism's genomic DNA may be screened using the
arrays, to reveal discrete hybridisation patterns that are unique to an individual. This
embodiment may therefore be likened to a "bar code" for each organism. The organism's
genomic DNA may be first fragmented and detectably-labelled, for example with a
fluorophore. The fragmented DNA is then applied to the array under hybridising
conditions and any hybridisation events monitored.

Alternatively, hybridisation may be detected using an in-built fluorescence based
detection system in the arrayed molecule, for example using the "molecular beacons"
described in Nature Biotechnology (1996) 14, 303-308.

It is possible to design the arrays so that the hybridisation pattern generated is
unique to the organism and so could be used to provide valuable information on the
genetic character of an individual. This may have many useful applications in forensic
science. Alternatively, the methods may be carried out for the detection of mutations or
allelic variants within the genomic DNA of an organism.

For genotyping, it is desirable to identify if a particular sequence is present in the
genome. The smallest possible unique oligomer is a 16-mer (assuming randomness of
the genome sequence), i.e. statistically there is a probability of any given 16-base
sequence occurring only once in the human genome (which has 3 x 10^9 bases). There are
c. 4 x 10^9 possible 16-mers which would fit within a region of 2 cm x 1 cm (assuming a
single copy at a density of 1 molecule per 250 nm x 250 nm square). It is therefore
necessary to determine only if a particular 16-mer is present or not, and so quantitative
measurements are unnecessary. Identifying a mutation in a particular region and what
the mutation is can be carried out using the 16-mer library. Mapping back onto the
human genome would be possible using published data and would not be a problem once
the entire genome has been determined. There is a built-in self-check, by looking at the
hybridisation to particular 16-mers so that if there is a single point mutation, this will
show up in 16 different 16-mers, identifying a region of 32 bases in the genome (the
mutation would occur at the top of one 16-mer and then at the second base in a related
16-mer etc). Thus, a single point mutation would result in 16 of the 16-mers not
showing hybridisation and a new set of 16 showing hybridisation plus the same thing for
the complementary strand. In summary, considering both strands of DNA, a single point
mutation would result in 32 of the 16-mers not showing hybridisation and 32 new 16-mers
showing hybridisation, i.e. quite large changes on the hybridisation pattern to the
array.
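
The counting argument in this paragraph can be checked with a short sketch. The 64-base reference sequence, the random seed and the helper name `kmers` are hypothetical and chosen only for illustration; the sketch considers one strand and assumes the mutated position is far enough from the ends of the sequence to be covered by a full set of 16 windows.

```python
import random

def kmers(seq, k=16):
    """Return the set of all k-mers present in seq (one strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

K = 16
possible_16mers = 4 ** K          # 4,294,967,296, i.e. c. 4 x 10^9

# Sites available on 2 cm x 1 cm at one molecule per 250 nm x 250 nm square.
NM_PER_CM = 10_000_000
sites = (2 * NM_PER_CM // 250) * (1 * NM_PER_CM // 250)   # 3,200,000,000

# Hypothetical reference sequence with a single point mutation at position 32.
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(64))
pos = 32
new_base = "A" if reference[pos] != "A" else "C"
mutant = reference[:pos] + new_base + reference[pos + 1:]

lost = kmers(reference) - kmers(mutant)     # 16-mers that no longer hybridise
gained = kmers(mutant) - kmers(reference)   # new 16-mers that now hybridise
```

Every window of 16 bases that overlaps the mutated position changes, so `lost` and `gained` each hold 16 entries per strand, which doubles to 32 when the complementary strand is included. The site count comes out at the same order of magnitude as the number of possible 16-mers.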

By way of example, a sample of human genomic DNA may be restriction-digested
to generate short fragments, then labelled using a fluorescently-labelled monomer and a
DNA polymerase or a terminal transferase enzyme. This produces short lengths of
sample DNA with a fluorophore at one end. The melted fragments may then be exposed
to the array and the pixels where hybridisation occurs or not would be identified. This
produces a genetic bar code for the individual with (if oligonucleotides of length 16 were
used) c. 4 x 10^9 binary coding elements. This would uniquely define a person's genotype
for pharmacogenomic applications. Since the arrays should be reusable, the same process
could be repeated on a different individual.

In one embodiment of the invention, a method for determining a single nucleotide
polymorphism (SNP) present in a genome comprises immobilising fragments of the
genome onto the surface of a solid support to form an array as defined above, identifying
nucleotides at selected positions in the genome, and comparing the results with a known
consensus sequence to identify any differences between the consensus sequence and the
genome. Identifying the nucleotides at selected positions in the genome may be carried
out by contacting the array sequentially with each of the bases A, T, G and C, under
conditions that permit the polymerase reaction to proceed, and monitoring the
incorporation of a base at selected positions in the complementary sequence.

The fragments of the genome may be unamplified DNA obtained from several
cells from an individual, which is treated with a restriction enzyme. As indicated above,
it is not necessary to determine the sequence of the full fragment. For example, it may
be preferable to determine the sequence of 16-30 specific bases, which is sufficient to
identify the DNA fragment by comparison to a consensus sequence, e.g. to that known
from the Human Genome Project. Any SNP occurring within the sequenced region can
then be identified. The specific bases do not have to be contiguous. For example, the
procedure may be carried out by the incorporation of non-labelled bases followed, at
pre-determined positions, by the incorporation of a labelled base. Provided that the sequence
of sufficient bases is determined, it should be possible to identify the fragment. Again,
any SNPs occurring at the determined base positions can be identified. For example, the
method may be used to identify SNPs that occur after cytosine. Template DNA
(genomic fragments) can be contacted with each of the bases A, T and G, added
sequentially or together, so that the complementary strand is extended up to a position
that requires C. Non-incorporated bases can then be removed from the array, followed
by the addition of C. The addition of C is followed by monitoring the next base
incorporation (using a labelled base). By repeating this process a sufficient number of
times, a partial sequence is generated where each base immediately following a C is
known. It will then be possible to identify the full sequence, by comparison of the partial
sequence to a reference sequence. It will then also be possible to determine whether
there are any SNPs occurring after any C.
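
The "base after C" procedure can be sketched as follows. This is a simplified model under stated assumptions: the function names `bases_after_c` and `snps_after_c` and the short template sequences are hypothetical, the template is written in the order the polymerase copies it, and every consecutive C is assumed to be incorporated during the single C addition step.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def bases_after_c(template):
    """Return [(position, base)] for each base read directly after a run of C.

    Positions index the extended (complementary) strand from 0.
    """
    extended = "".join(COMPLEMENT[b] for b in template)
    reads = []
    i = 0
    while i < len(extended):
        # Unlabelled A, T and G extend the strand until a C is required.
        while i < len(extended) and extended[i] != "C":
            i += 1
        # Unlabelled C is added; consecutive C positions are all filled.
        while i < len(extended) and extended[i] == "C":
            i += 1
        # The next incorporation uses a labelled base and is recorded.
        if i < len(extended):
            reads.append((i, extended[i]))
            i += 1
    return reads

def snps_after_c(sample, reference):
    """Compare the partial reads of a sample with those of a reference."""
    ref_reads = dict(bases_after_c(reference))
    return [(pos, ref_reads[pos], base)
            for pos, base in bases_after_c(sample)
            if pos in ref_reads and ref_reads[pos] != base]

reference_reads = bases_after_c("TGACGGTCA")
differences = snps_after_c("TGACGGCCA", "TGACGGTCA")
```

Here the reference template yields reads at positions 2 and 6 of the extended strand, and the sample, which differs at one template base, is reported as a change from A to G at position 6.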

To further illustrate this, a device may comprise 10^7 restriction fragments per cm^2.
If 30 bases are determined for each fragment, this means 3 x 10^8 bases are identified.
Statistically, this should determine 3 x 10^5 SNPs for the experiment. If the fragments
each comprise 1000 nucleotides, it is possible to have 10^10 nucleotides per cm^2, or three
copies of the human genome. The approach therefore permits large sequence or SNP
analysis to be performed.
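
The arithmetic behind these figures can be laid out explicitly. The SNP frequency of roughly one per 1000 bases is an assumption introduced here to connect the number of bases read to the number of SNPs; it is not stated in the text.

```python
fragments_per_cm2 = 10**7
bases_read_per_fragment = 30
bases_identified = fragments_per_cm2 * bases_read_per_fragment   # 3 x 10^8

# Assumption (not from the text): about one SNP per 1000 bases.
assumed_snp_rate = 1 / 1000
expected_snps = bases_identified * assumed_snp_rate              # about 3 x 10^5

fragment_length = 1000
nucleotides_per_cm2 = fragments_per_cm2 * fragment_length        # 10^10
human_genome_bases = 3 * 10**9
genome_copies = nucleotides_per_cm2 / human_genome_bases         # a little over 3
```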

Viral and bacterial organisms may also be studied, and screening nucleic acid
samples may reveal pathogens present in a disease, or identify microorganisms in
analytical techniques. For example, pathogenic or other bacteria may be identified using
a series of single molecule DNA chips produced from different strains of bacteria. Again,
these chips are simple to make and reusable.

In a further example, double-stranded arrays may be used to screen protein
libraries for binding, using fluorescently labelled proteins. This may determine proteins
that bind to a particular DNA sequence, i.e. proteins that control transcription. Once the
short sequence that the protein binds to has been determined, it may be made and affinity
purification used to isolate and identify the protein. Such a method could find all the
transcription-controlling proteins. One such method is disclosed in Nature
Biotechnology (1999) 17, 573-577.

Another use is in expression monitoring. For this, a label is required for each
gene. There are approximately 100,000 genes in the human genome. There are 262,144
possible 9-mers, so this is the minimum length of oligomer needed to have a unique tag
for each gene. This 9-mer label needs to be at a specific point in the DNA and the best
point is probably immediately after the poly-A tail in the mRNA (i.e. a 9-mer linked to
a poly-T guide sequence). Multiple copies of these 9-mers should be present, to permit
quantitation of gene expression. 100 copies would allow determination of relative
expression from 1-100%. 10,000 copies would allow determination of relative gene
expression from 0.01-100%. 10,000 copies of 262,144 9-mers would fit inside 1 cm x 1
cm at close to maximum density.
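
The tag-length and copy-number reasoning can be reproduced in a few lines. The helper names `min_tag_length` and `resolution_percent` are hypothetical and used only for this illustration.

```python
def min_tag_length(n_genes, alphabet=4):
    """Smallest k such that alphabet**k distinct tags cover n_genes."""
    k = 1
    while alphabet ** k < n_genes:
        k += 1
    return k

def resolution_percent(copies):
    """Smallest relative expression step resolvable with this many copies."""
    return 100 / copies

genes = 100_000
k = min_tag_length(genes)        # 9, since 4**8 = 65,536 is too few
tags = 4 ** k                    # 262,144
molecules = 10_000 * tags        # 2,621,440,000 molecules on the array
```

With 100 copies per tag the resolution is 1%, and with 10,000 copies it is 0.01%, matching the ranges quoted above.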

The use of nanovials in conjunction with any of the above methods may allow a
molecule to be cleaved from the surface, yet retain its spatial integrity. This permits the
generation of spatially addressable arrays of single molecules in free solution, which may
have advantages where the surface attachment impedes the analysis (e.g. drug screening).
A nanovial is a small cavity in a flat glass surface, e.g. approx. 20 µm in diameter and 10
µm deep. They can be placed every 50 µm, and so the array would be less dense than
a surface-attached array; however, this could be compensated for by appropriate
adjustment in the imaging optics.

The following Examples illustrate the invention, with reference to the 
accompanying drawings. 
Example 1 

The microscope set-up used in the following Example was based on a modified
confocal fluorescence system using a photon detector as shown in Figure 1. Briefly, a
narrow, spatially filtered laser beam (CW Argon Ion Laser Technology RPC50) was
passed through an acousto-optic modulator (AOM) (AA Opto-Electronic) which acts
as a fast optical switch. The acousto-optic modulator was switched on and the laser
beam was directed through an oil immersion objective (100X, NA = 1.3) of an inverted
optical microscope (Nikon Diaphot 200) by a dichroic beam splitter (540DRLP02 or
505DRLP02, Omega Optics Inc.). The objective focuses the light to a diffraction-limited
spot on the target sample immobilised on a thin glass coverslip. Fluorescence from the
sample was collected by the same objective, passed through the dichroic beam splitter
and directed through a 50 µm pinhole (Newport Corp.) placed in the image plane of the
microscope observation port. The pinhole rejects light emerging from the sample which
is out of the plane of the laser focus. The transmitted fluorescence was separated
spectrally by a dichroic beam splitter into red and green components which were filtered
to remove residual laser scatter. The remaining fluorescence components were then
focused onto separate single photon avalanche diode detectors and the signals recorded
onto a multichannel scaler (MCS) (MCS-Plus, EG&G Ortec) with time resolutions in
the 1 to 10 ms range.

The target sample was a 5'-biotin-modified 13-mer primer oligonucleotide
prepared using conventional phosphoramidite chemistry, and having SEQ ID No. 1 (see
listing, below). The oligonucleotide was post-synthetically modified by reaction of the
uridine base with the succinimidyl ester of tetramethylrhodamine (TMR).

Glass coverslips were prepared by cleaning with acetone and drying under
nitrogen. A 50 µl aliquot of biotin-BSA (Sigma) redissolved in PBS buffer (0.01 M, pH
7.4) at 1 mg/ml concentration was deposited on the clean coverslip and incubated for 8
hours at 30°C. Excess biotin-BSA was removed by washing 5 times with MilliQ water
and drying under nitrogen. Non-fluorescent streptavidin-functionalised polystyrene latex
microspheres of diameter 500 nm (Polysciences Inc.) were diluted in 100 mM NaCl to 0.1
solids and deposited as a 1 µl drop on the biotinylated coverslip surface. The spheres
were allowed to dry for one hour and unbound beads removed by washing 5 times with
MilliQ water. This procedure resulted in a surface coverage of approximately 1
sphere/100 µm x 100 µm.

The non-fluorescent microspheres were found to have a broad residual
fluorescence at excitation wavelength 514 nm, probably arising from small quantities of
photoactive constituents used in the colloidal preparation of the microspheres. The
microspheres were therefore photobleached by treating the prepared coverslip in a laser
beam of a frequency doubled (532 nm) Nd:YAG pulsed dye laser, for 1 hour.

The biotinylated 13-TMR ssDNA was coupled to the streptavidin functionalised
microspheres by incubating a 50 µl sample of 0.1 pM DNA (diluted in 100 mM NaCl,
100 mM Tris) deposited over the microspheres. Unbound DNA was removed by
washing the coverslip surface 5 times with MilliQ water.

Low light level illumination from the microscope condenser was used to position
visually a microsphere at 10x magnification so that when the laser was switched on the
sphere was located in the centre of the diffraction limited focus. The condenser was then
turned off and the light path switched to the fluorescence detection port. The MCS was
initiated and the fluorescence emitted from the latex sphere recorded on one or both
channels. The sample was excited at 514 nm and detection was made on the 600 nm
channel.

Figure 3 shows clearly that the fluorescence is switched on as the laser is
deflected into the microscope by the AOM, 0.5 seconds after the start of a scan. The
intensity of the fluorescence remains relatively constant for a short period of time (100
ms-3 s) and disappears in a single step process. The results show that single molecule
detection is occurring. This single step photobleaching is unambiguous evidence that the
fluorescence is from a single molecule.
Example 2

This Example illustrates the preparation of single molecule arrays by direct
covalent attachment to glass followed by a demonstration of hybridisation to the array.
Covalently modified slides were prepared as follows. Spectrosil-2000 slides
(TSL, UK) were rinsed in milli-Q to remove any dust and placed wet in a bottle
containing neat Decon-90 and left for 12 h at room temperature. The slides were rinsed
with milli-Q and placed in a bottle containing a solution of 15%
glycidoxypropyltrimethoxy-silane in milli-Q and magnetically stirred for 4 h at room
temperature, rinsed with milli-Q and dried under N2 to liberate an epoxide coated surface.

The DNA used was that shown in SEQ ID No. 2 (see sequence listing below),
where c represents a 5-methyl cytosine (Cy5) with a TMR group coupled via a linker to
the N4 position.

A sample of this (5 µl, 450 pM) was applied as a solution in neat milli-Q.
The DNA reaction was left for 12 h at room temperature in a humid atmosphere
to couple to the epoxide surface. The slide was then rinsed with milli-Q and dried under
N2.
The prepared slides can be stored wrapped in foil in a desiccator for at least a
week without any noticeable contamination or loss of bound material. Control DNA of
the same sequences and fluorophore but without the 5'-amino group shows little stable
coverage when applied at the same concentration.
The TMR labelled slides were then treated with a solution of complementary
DNA (SEQ ID No. 3) (5 µM, 10 µl) in 100 mM PBS. The complementary DNA has the
sequence shown in SEQ ID No. 3, where n represents a methylcytosine group.

After 1 hour at room temperature the slides were cooled to 4°C and left for 24
hours. Finally, the slides were washed in PBS (100 mM, 1 mL) and dried under N2.

A chamber was constructed on the slide by sealing a coverslip (No. 0, 22 x 22 mm,
Chance Propper Ltd, UK) over the sample area on two sides only with prehardened
microscope mounting medium (Eukitt, O. Kindler GmbH & Co., Freiburg, Germany)
whilst maintaining a gap of less than 200 µm between slide and coverslip. The chamber
was flushed 3x with 100 µl PBS (100 mM NaCl) and allowed to stabilise for 5 minutes
before analysing on a fluorescence microscope.

The slide was inverted so that the chamber coverslip contacted the objective lens
of an inverted microscope (Nikon TE200) via an immersion oil interface. A 60° fused
silica dispersion prism was optically coupled to the back of the slide through a thin film
of glycerol. Laser light was directed at the prism such that at the glass/sample interface
it subtends an angle of approximately 68° to the normal of the slide and subsequently
undergoes Total Internal Reflection (TIR). The critical angle for glass/water interface
is 66°.
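
The quoted critical angle can be checked from Snell's law, and the standard expression for the decay depth of the evanescent intensity gives a feel for how thin the excited layer is. The refractive indices used below (1.46 for fused silica, 1.33 for water) are typical values assumed for illustration and are not given in the text; the function names are likewise hypothetical.

```python
import math

def critical_angle_deg(n_dense, n_rare):
    """Critical angle for total internal reflection, from Snell's law."""
    return math.degrees(math.asin(n_rare / n_dense))

def penetration_depth_nm(wavelength_nm, angle_deg, n_dense, n_rare):
    """1/e intensity decay depth of the evanescent field beyond the interface."""
    s = (n_dense * math.sin(math.radians(angle_deg))) ** 2 - n_rare ** 2
    if s <= 0:
        raise ValueError("angle is below the critical angle; no TIR")
    return wavelength_nm / (4 * math.pi * math.sqrt(s))

N_SILICA = 1.46   # assumed refractive index of fused silica
N_WATER = 1.33    # assumed refractive index of the aqueous sample

theta_c = critical_angle_deg(N_SILICA, N_WATER)              # close to 66 degrees
depth = penetration_depth_nm(532, 68, N_SILICA, N_WATER)     # order of 100-200 nm
```

Under these assumed indices, an incidence angle of 68° lies just above the critical angle, so only molecules within a thin layer at the surface are excited.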

Fluorescence from single molecules of DNA-TMR or DNA-Cy5 produced by
excitation with the surface specific evanescent wave following TIR is collected by the
objective lens of the microscope and imaged onto an intensified Charge Coupled Device
(ICCD) camera (Pentamax, Princeton Instruments, NJ). Two images were recorded
using a combination of 1) 532 nm excitation (frequency doubled solid state Nd:YAG,
Antares, Coherent) with a 580 nm fluorescence (580DF30, Omega Optics, USA) filter
for TMR and 2) 630 nm excitation (Nd:YAG pumped dye laser, Coherent 700) with a
670 nm filter (670DF40, Omega Optics, USA) for Cy5. Images were recorded with an
exposure time of 500 ms at the maximum gain of 10 on the ICCD. Laser powers incident
at the prism were 50 mW and 40 mW at 532 nm and 630 nm respectively. A third image
was taken with 532 nm excitation and detection at 670 nm to determine the level of
cross-talk from TMR on the Cy5 channel.

Single molecules were identified by single points of fluorescence with average
intensities greater than 3x that of the background. Fluorescence from a single molecule
is confined to a few pixels, typically a 3x3 matrix at 100x magnification, and has a narrow
Gaussian-like intensity profile. Single molecule fluorescence is also characterised by a
one-step photobleaching process in the time course of the intensity and was used to
distinguish single molecules from pixel regions containing two or more molecules, which
exhibited multi-step processes. Figures 4a and 4b show 60 µm x 60 µm fluorescence
images from covalently modified slides with DNA-TMR starting concentrations of 45 pM
and 450 pM. Figure 4c shows a control slide which was treated as above but with
DNA-TMR lacking the 5' amino modification.

To count molecules, a threshold for fluorescence intensities is first set to exclude
background noise. For a control sample, the background is essentially the thermal noise
of the ICCD, measured to be 76 counts with a standard deviation of only 6 counts. A
threshold is arbitrarily chosen as a linear combination of the background, the average
counts over an image and the standard deviation over an image. In general, the latter two
quantities provide a measure of the number of pixels and range of intensities above
background. This method gives rise to threshold levels which are at least 12 standard
deviations above the background with a probability of less than 1 in 144 pixels
contributing from noise. By defining a single molecule fluorescent point as being at least
a 2x2 matrix of pixels and no larger than a 7x7, the probability of a single background
pixel contributing to the counting is eliminated and clusters are ignored.
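
The counting rule can be sketched in plain Python. This is not the analysis code used in the Example: the function name `count_single_molecules`, the use of 4-connected flood fill and the use of the bounding box to apply the 2x2 to 7x7 size limits are assumptions. The threshold in the usage lines is simply the background plus 12 standard deviations, standing in for the linear combination whose coefficients are not given.

```python
def count_single_molecules(image, threshold, min_side=2, max_side=7):
    """Count bright spots whose bounding box is min_side..max_side pixels.

    image: list of rows of pixel counts. Pixels above threshold that touch
    horizontally or vertically are grouped into one spot.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill to collect one connected spot and its bounding box.
            stack = [(r, c)]
            seen[r][c] = True
            rmin = rmax = r
            cmin = cmax = c
            while stack:
                y, x = stack.pop()
                rmin, rmax = min(rmin, y), max(rmax, y)
                cmin, cmax = min(cmin, x), max(cmax, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            height = rmax - rmin + 1
            width = cmax - cmin + 1
            # Reject isolated noise pixels (too small) and clusters (too large).
            if min_side <= height <= max_side and min_side <= width <= max_side:
                count += 1
    return count

BACKGROUND, SIGMA = 76, 6
threshold = BACKGROUND + 12 * SIGMA          # 148 counts (illustrative choice)

# Synthetic 10 x 10 frame: one 3x3 spot and one isolated hot pixel.
frame = [[BACKGROUND] * 10 for _ in range(10)]
for y in range(1, 4):
    for x in range(1, 4):
        frame[y][x] = 300
frame[7][7] = 300
molecules_found = count_single_molecules(frame, threshold)
```

In the synthetic frame the 3x3 spot is counted and the isolated hot pixel is rejected, mirroring the size criteria described above.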

In this manner, the surface density of single molecules of DNA-TMR is measured
at 2.9 x 10^6 molecules/cm^2 (238 molecules in Figure 4a) and 5.8 x 10^6 molecules/cm^2 (469
molecules in Figure 4b) at 45 pM and 450 pM DNA-TMR coupling concentrations. The
density is clearly not directly proportional to DNA concentration but will be some
function of the concentration, the volume of sample applied, the area covered by the
sample and the incubation time. The percentage of non-specifically bound DNA-TMR
and impurities contribute of the order of 3-9% per image (8 non-specifically bound
molecules in Figure 4c). Analysis of the photobleaching profiles shows only 6% of
fluorescence points contain more than 1 molecule.



Hybridisation was identified by the co-localisation of discrete points of
fluorescence from single molecules of TMR and Cy5 following the superposition of two
images. Figures 5a and 5b show images of surface bound 20-mer labelled with TMR and
the complementary 20-mer labelled with Cy5 deposited from solution. Figure 5d shows
those fluorescent points that are co-localised on the two former images. The degree of
hybridisation was estimated to be 7% of the surface-bound DNA (10 co-localised points
in 141 points from Figures 5d and 5a, respectively). The percentage of hybridised DNA
is estimated to be 37% of all surface-adsorbed DNA-Cy5 (10 co-localised points in 27
points from Figures 5d and 5b, respectively). Single molecules were counted by
matching size and intensity of fluorescent points to threshold criteria which separate
single molecules from background noise and cosmic rays. Figure 5c shows the level of
cross-talk from TMR on the Cy5 channel which is 2% as determined by counting only
those fluorescent points which fell within the criteria for determining the TMR single
molecule fluorescence (2 fluorescence points in 141 points from Figures 5c and 5a,
respectively).
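
The percentages quoted in this paragraph follow directly from the point counts. A minimal sketch, with the helper name `percent` chosen for illustration:

```python
def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole

co_localised_points = 10
tmr_points = 141      # surface-bound DNA-TMR points
cy5_points = 27       # surface-adsorbed DNA-Cy5 points
cross_talk_points = 2

hybridised_of_surface_bound = percent(co_localised_points, tmr_points)  # about 7%
hybridised_of_cy5 = percent(co_localised_points, cy5_points)            # about 37%
cross_talk = percent(cross_talk_points, tmr_points)   # about 1.4%, quoted as 2%
```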

This Example demonstrates that single molecule arrays can be formed, and
hybridisation events detected according to the invention. It is expected that the skilled
person will realise that modifications may be made to improve the efficiency of the
process. For example, improved washing steps, e.g. using a flow cell, would reduce
background noise and permit more concentrated solutions to be used, and hybridisation
protocols could be adapted by varying the parameters of temperature, buffer, time etc.
Example 3 

This experiment demonstrates the possibility of performing enzymatic
incorporation on a single molecule array. In summary, primer DNA was attached to the
surface of a solid support, and template DNA hybridised thereto. Two cycles of
incorporation of fluorophore-labelled nucleotides were then completed. This was
compared against a reference experiment where the immobilised DNA was pre-labelled
with the same two fluorophores prior to attachment to the surface, and control
experiments performed under adverse conditions for nucleotide incorporation.

The primer DNA sequence and the template DNA sequence used in this
experiment are shown in SEQ ID NOS. 4 and 5, respectively.
The buffer used contained 4 mM MgCl2, 2 mM DTT, 50 mM Tris.HCl (pH 7.6),
10 mM NaCl and 1 mM K2PO4 (100 µl).
Preparation of Slides 

Silica slides were treated with decon for at least 24 hours and rinsed in water and
EtOH directly before use. The dried slides were placed in a 50 ml solution of 2%
glycidoxypropyltrimethoxysilane in EtOH/H2SO4 (2 drops/500 ml) at room temperature
for 2 hours. The slides were then rinsed in EtOH from a spray bottle and dried under N2.

The DNA samples (SEQ ID NO. 4) were applied either as a 40-100 pM solution
(5 µl) in 10 mM K2PO4 pH 7.6 (allowed to dry overnight), or at least 1 concentration
over a sealed slide. The slides allowed to dry overnight were left over a layer of water
for 18 hours at room temperature and then rinsed with milli-Q (approx. 30 ml from a
spray bottle) and dried under N2. The sealed slides were simply flushed with 50 ml buffer
prior to use. Control slides with no coupled DNA were simply left under the buffer for
identical time periods.

Enzyme Extensions on a Surface

For the first incorporation cycle, samples were prepared with the buffer
containing BSA (to 0.2 mg/ml), the triphosphate (Cy3dUTP; to 20 µM) and the
polymerase enzyme (T4 exo-; to 500 nM). In certain experiments, the template DNA
was also added at 2 µM. The mixture was flowed into cells which were incubated at
37°C for 2 hours and flushed with 500 ml buffer. The second incorporation cycle with
Cy5dCTP (20 µM), dATP (100 µM) and dGTP (100 µM) was performed in the same
way. The cells were flushed with 50 ml buffer and left for 12 hours prior to imaging.
Control reactions were performed as above with: a) no DNA coupled prior to extension;
b) DNA attached but no polymerase in the extension buffer; and c) DNA attached, but
the polymerase denatured by boiling.

Reference Sample 

A reference sample, not immobilised to the surface, was prepared in the following
way.

Buffer containing 1 µM of the sample DNA, BSA (0.2 mg/ml), TMR-labelled
dUTP (20 µM) and the polymerase enzyme (T4 exo-, 500 nM; 100 µl) was prepared.

The reaction was analysed and purified by reverse phase HPLC (5-30%
acetonitrile in ammonium acetate over 30 min.) with UV and fluorescence detection. In
all cases, the labelled DNA was clearly separate from both the unlabelled DNA and the
labelled dNTPs. The material was concentrated and dissolved in 10 mM K2PO4 for
analysis by A260 and fluorescence. The material purified by HPLC was further extended
with labelled dCTP (20 µM), dATP (100 µM) and dGTP (100 µM) and HPLC purified
again. Surface coupling was then performed dry, at 100 pM concentrations.
Microscopic Analysis 

Following the single molecule DNA attachment procedure and extension
reactions, the sample cells were analysed on a single molecule total internal reflection
fluorescence microscope (TIRFM) in the following manner. A 60° fused silica dispersion
prism was coupled optically to the slide through an aperture in the cell via a thin film of
glycerol. Laser light was directed at the prism such that at the glass/sample interface it
subtends an angle of approximately 68° to the normal of the slide and subsequently
undergoes total internal reflection. The critical angle for a glass/water interface is 66°.
An evanescent field is generated at the interface which penetrates only ~150 nm into the
aqueous phase. Fluorescence from single molecules excited within this evanescent field
is collected by a 100X objective lens of an inverted microscope, filtered spectrally from
the laser light and imaged onto an Intensified Charge Coupled Device (ICCD) camera.
Two 90 µm x 90 µm images were recorded using a combination of 1) 532 nm
excitation (frequency doubled Nd:YAG) with a 580 nm interference filter for Cy3
detection, and 2) 630 nm excitation (Nd:YAG pumped DCM dye laser) with a 670 nm
filter for Cy5 detection. Images were recorded with an exposure time of 500 ms at the
maximum ICCD gain of 5.75 counts/photoelectron. Laser powers incident at the prism
were 30 mW and 30 mW at 532 nm and 630 nm respectively. Two colour fluorophore
labelled nucleotide incorporations are identified by the co-localisation of discrete points
of fluorescence from single molecules of Cy3 and Cy5 following superimposing the two
images. Molecules are considered co-localised when fluorescent points are within a pixel
separation of each other. For a 90 µm x 90 µm field, projected onto a CCD array of
512x512 pixels, the pixel size dimension is 0.176 µm.
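
The pixel size and the co-localisation criterion can be sketched as follows. The reading of "within a pixel separation" as a difference of at most one pixel in each of the row and column directions is an assumption, as are the function names and the greedy one-to-one matching of points.

```python
FIELD_UM = 90.0
PIXELS = 512
pixel_size_um = FIELD_UM / PIXELS      # 0.17578125, i.e. about 0.176 um

def co_localised(p, q, max_pixel_separation=1):
    """p and q are (row, col) pixel coordinates from the two images."""
    return (abs(p[0] - q[0]) <= max_pixel_separation
            and abs(p[1] - q[1]) <= max_pixel_separation)

def count_co_localised(cy3_points, cy5_points):
    """Count Cy3 points paired with a Cy5 point; each Cy5 point is used once."""
    unused = list(cy5_points)
    matches = 0
    for p in cy3_points:
        for q in unused:
            if co_localised(p, q):
                unused.remove(q)
                matches += 1
                break
    return matches

# Hypothetical point lists: only the first Cy3 point has a Cy5 neighbour.
pairs = count_co_localised([(10, 10), (50, 50)], [(11, 10), (100, 100)])
```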
Results 

The results of the experiment are shown in Table 1. The values shown are an
average of the number of molecules imaged (Cy3 and Cy5) over all frames (100 in each)
compiled in each experiment and the number of those molecules which are co-localised.
The final column represents the number of co-localised molecules expected if the two
fluorophores were randomly dispersed across the sample slide (N, calculated from n and
Δr, where n is the surface density of molecules and Δr = 0.176 µm is the minimum
measurable separation). The number in brackets indicates the magnitude by which the
level of co-localisation in each experiment is greater than random.



Table 1

System             Cy3    Cy5    Co-local    % of Total    Random
Reference           30     36       3           8          0.05 (x100)
Incorporation A     75     75      12           8          0.3 (x40)
Incorporation B    354    570      76           8          10 (x7.6)
No DNA             110    280       9           2          2 (x3.5)
No Enzyme           26    332       3           1          1.5 (x2)
Denatured T4        89    624      18           2.5        6 (x3)

The percentage of co-localisation observed on the reference sample represents the
maximum measurable for a dual labelled system, i.e. there is a detection ceiling due to
photophysical effects which means the level is not 100%. These effects may arise from
interactions of the fluorophores with the DNA or the surface or both.

There is a statistically higher level of co-localisation in the incorporation
experiments compared to the controls (8% versus 2% respectively). This shows that it
is possible to perform enzymatic incorporation on the SMA and the level of incorporation
is close to that of the reference sequence. Improvements in the surface attachment and
the nature of the surface are required to increase the level of co-localisation in the
reference and to increase the detection efficiency of the enzymatic incorporation.

Example 4

This Example illustrates the preparation of single molecule arrays by direct 
covalent attachment of hairpin loop structures to glass.

A solution of 1% glycidoxypropyltrimethoxysilane in 95% ethanol/5% water
with 2 drops H2SO4 per 500 ml was stirred for 5 minutes at room temperature. Clean,
dry Spectrosil-2000 slides (TSL, UK) were placed in the solution and the stirring
stopped. After 1 hour the slides were removed, rinsed with ethanol, dried under N2 and
oven-cured for 30 min at 100°C. These 'epoxide' modified slides were then treated with
1 µM of labelled DNA (5'-Cy3-CTGCTGAAGCGTCGGCAGGT-heg-aminodT-heg-
ACCTGCCGACGCT-3') (SEQ ID NOS. 6 and 7) in 50 mM potassium phosphate buffer,
pH 7.4 for 18 hours at room temperature and, prior to analysis, flushed with 50 mM
potassium phosphate, 1 mM EDTA, pH 7.4. The coupling reactions were performed in
sealed teflon blocks under a pre-mounted coverslip to prevent evaporation of the sample
and allow direct imaging.

The DNA structure was designed as a self-priming template system with an
internal amino group attached as an amino deoxy-thymidine held by two 18 atom
hexaethylene glycol (heg) spacers, and was synthesised by conventional DNA synthesis
techniques using phosphoramidite monomers.
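The self-priming design can be checked with a short sketch: the 3' arm should be the reverse complement of the end of the 5' arm, leaving a single-stranded 5' overhang as template. The two arm sequences below are transcribed by hand from the hairpin given above (with the unclear base read as G) and should be verified against the sequence listing; the helper name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Arms on either side of the heg-aminodT-heg linker (assumed transcription).
five_prime_arm = "CTGCTGAAGCGTCGGCAGGT"
three_prime_arm = "ACCTGCCGACGCT"

# The stem that the 3' arm can pair with, read on the 5' arm.
stem = reverse_complement(three_prime_arm)
forms_hairpin = five_prime_arm.endswith(stem)

# Unpaired 5' bases that remain available as template.
overhang = five_prime_arm[: len(five_prime_arm) - len(stem)]

print(stem)
print(forms_hairpin)
print(overhang)
```

With these arms the 13-base 3' arm pairs with the last 13 bases of the 5' arm, so the recessed 3' end can prime extension along the remaining 7-base overhang.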

For imaging, one slide was inverted so that the chamber coverslip contacted the
objective lens of an inverted microscope (Nikon TE200) via an immersion oil interface.
A 60° fused silica dispersion prism was coupled optically to the back of the slide through
a thin film of glycerol. Laser light was directed at the prism such that at the glass/sample
interface it subtends an angle of approximately 68° to the normal of the slide and
subsequently undergoes Total Internal Reflection (TIR). The critical angle for a
glass/water interface is 66°.
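The quoted critical angle follows from Snell's law, sin(theta_c) = n_water / n_glass. The sketch below is illustrative only: the refractive indices (about 1.46 for fused silica and 1.33 for water) are assumed typical values and are not stated in the text.

```python
import math

N_FUSED_SILICA = 1.46  # assumed refractive index of the slide/prism
N_WATER = 1.33         # assumed refractive index of the aqueous sample


def critical_angle_deg(n_dense, n_rare):
    """Critical angle (degrees from the normal) for light passing from n_dense to n_rare."""
    return math.degrees(math.asin(n_rare / n_dense))


theta_c = critical_angle_deg(N_FUSED_SILICA, N_WATER)
incidence_deg = 68.0  # angle of incidence quoted in the text

print(round(theta_c))           # critical angle, rounded to the nearest degree
print(incidence_deg > theta_c)  # True when the beam is totally internally reflected
```

Under these assumed indices the 68° incidence angle lies just above the critical angle, which is consistent with the beam undergoing TIR at the glass/sample interface.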

Fluorescence from single molecules of DNA-Cy3, produced by excitation with
the surface-specific evanescent wave following TIR, was collected by the objective lens
of the microscope and imaged onto an Intensified Charge Coupled Device (ICCD)
camera (Pentamax, Princeton Instruments, NJ). The image was recorded using 532 nm
excitation (frequency-doubled solid-state Nd:YAG, Antares, Coherent) with a 580 nm
fluorescence (580DF30, Omega Optics, USA) filter for Cy3. Images were recorded with
an exposure time of 500 ms at the maximum gain of 10 on the ICCD. Laser powers
incident at the prism were 50 mW at 532 nm.

Single molecules were identified as described in Example 2.

The surface density of single molecules of DNA-Cy3 was measured at
approximately 500 per 100 µm x 100 µm image, or 5 x 10⁶ cm⁻².
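The conversion from molecules per image to molecules per square centimetre is a unit change only, and can be sketched as follows (the function name is hypothetical):

```python
UM_PER_CM = 1.0e4  # micrometres in one centimetre


def density_per_cm2(count, side_um):
    """Surface density in molecules per cm^2 for `count` molecules in a square field."""
    side_cm = side_um / UM_PER_CM
    return count / (side_cm ** 2)


density = density_per_cm2(500, 100.0)
print(f"{density:.1e} molecules per cm^2")
```

A 100 µm x 100 µm field is 10⁻⁴ cm², so 500 molecules per field corresponds to the 5 x 10⁶ cm⁻² quoted above.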



